feat(ai): apply code improvements to AudioToText pipeline #4

rickstaa · 2024-07-14T14:39:07Z

@eliteprox here are some small code improvements I found.

rickstaa · 2024-07-14T14:40:26Z

common/util.go

@@ -75,6 +75,9 @@ var (
 	ErrProfEncoder  = fmt.Errorf("unknown VideoProfile encoder for protobufs")
 	ErrProfName     = fmt.Errorf("unknown VideoProfile profile name")

+	ErrUnsupportedAudioFormat   = fmt.Errorf("audio format unsupported")


@eliteprox this is a very useful error that should be implemented, but it is currently unused. I tried inputting the following unsupported OGG audio file:

25eb6f27.zip

But it threw an audio duration calculation error. Can we implement an unsupported audio file type error?

Checking file extensions isn't super reliable and ffmpeg has some edge cases where it returns 0 duration for supported file types, so we currently just check if the duration is <= 0.

The status code returned from GetCodecInfoBytes (which is currently ignored) should indicate that the file format is unsupported, although I don't know exactly what the code would be. However, I am unsure why the API returns both a code and an error; ideally it should be one or the other.

For what it is worth, in this specific case ffmpeg should be able to detect the duration for this file. Ogg is probably just not compiled into the production LPMS build, which uses a slimmed-down version of ffmpeg. Adding the ogg demuxer to install_ffmpeg.sh might be enough for this case.

This commit applies several code improvements to the AudioToText codebase.

rickstaa commented Jul 14, 2024

rickstaa force-pushed the add-speech-to-text-code-improvements branch from 3e88726 to 5104725 Compare July 14, 2024 14:42

feat(ai): apply code improvements to AudioToText pipeline

cb4360b

This commit applies several code improvements to the AudioToText codebase.

rickstaa force-pushed the add-speech-to-text-code-improvements branch from 5104725 to cb4360b Compare July 14, 2024 14:48

Merge branch 'add-speech-to-text' into add-speech-to-text-code-improv…

e307c70

…ements

eliteprox merged commit d40d41b into add-speech-to-text Jul 15, 2024
2 of 9 checks passed

rickstaa deleted the add-speech-to-text-code-improvements branch July 15, 2024 17:22

rickstaa Jul 14, 2024

eliteprox Jul 15, 2024

j0sh Jul 15, 2024